Short-read full-length 16S rRNA amplicon sequencing for characterisation of the respiratory bacteriome of captive and free-ranging African elephants (Loxodonta africana)

Hypervariable region sequencing of the 16S ribosomal RNA (rRNA) gene plays a critical role in microbial ecology by offering insights into bacterial communities within specific niches. While providing valuable genus-level information, its reliance on data from targeted genetic regions limits its overall utility. Recent advances in sequencing technologies have enabled characterisation of the full-length 16S rRNA gene, enhancing species-level classification. Although current short-read platforms are cost-effective and precise, they lack full-length 16S rRNA amplicon sequencing capability. This study aimed to evaluate the feasibility of a modified 150 bp paired-end full-length 16S rRNA amplicon short-read sequencing technique on the Illumina iSeq 100 and 16S rRNA amplicon assembly workflow by utilising a standard mock microbial community and subsequently performing exploratory characterisation of captive (zoo) and free-ranging African elephant (Loxodonta africana) respiratory microbiota. Our findings demonstrate that, despite generating assembled amplicons averaging 869 bp in length, this sequencing technique provides taxonomic assignments consistent with the theoretical composition of the mock community and respiratory microbiota of other mammals. Tentative bacterial signatures, potentially representing distinct respiratory tract compartments (trunk and lower respiratory tract) were visually identified, necessitating further investigation to gain deeper insights into their implication for elephant physiology and health.

roam the African continent, of which 44,326 are reported to reside within the geographical boundaries of South Africa 1,3 .The Kruger National Park is home to the largest population of wild African elephants, with an estimated count of 31,200 individual animals 3 .In addition, there are approximately 126 captive or managed African elephants reported in South Africa 4 .These elephant populations face further threats from infectious pathogens, such as elephant endotheliotropic herpesvirus (EEHV) 5 and tuberculosis (TB) [6][7][8] .The presence of disease in these populations can lead to detrimental consequences for the maintenance of a stable ecological niche 9 .These concerns highlight the need for tools that can easily and cost-effectively monitor the health of these animals to support conservation and ecosystem health.
The microbiome, which describes the distinct microbial communities that occupy the habitable compartments and niches of the host 10 , has been increasingly explored in health-related wildlife conservation efforts.Research has demonstrated that animal immune, gastrointestinal and reproductive health are intrinsically linked to the composition and corresponding functional processes of host-associated microbial communities 11 .In the context of elephant microbiome research, focus has been placed on the characterisation of African and Asian elephant intestinal microbiota in an effort to improve the health status and welfare of these animals [12][13][14][15][16][17][18][19] , restore the microbial diversity of captive elephants 12,14,15,17,19 , and enhance their diet 13,16,20 .Investigation into the respiratory microbiota may enable broad characterisation of the microbiota of healthy elephants, documentation of undescribed microbial diversity, and identification of relationships between microbiota and host disease phenotypes.Furthermore, such exploration may offer valuable insights into the broader respiratory microbial landscape of other mammals, reported as being dominated by Proteobacteria, Actinobacteriota, Firmicutes, and Bacteroidota phyla 21 , and may eventually aid in mapping pathogen transmission at human-animal interfaces.
The 16S ribosomal RNA (rRNA) gene is ubiquitous across bacterial species and is thus often used as a genetic marker for bacterial and archaeal taxonomic differentiation 22 .Within this gene are nine hypervariable regions, which, when sequenced in isolation or coupled with one or two other regions, provide a representative estimate of bacterial abundance at the genus level.However, it falls short of achieving adequate species-level taxonomic resolution, primarily due to the limited genetic information offered by the selected hypervariable regions [23][24][25] .Recent evidence has shown that investigation of full-length 16S rRNA gene sequences improves species-level taxonomic classification and minimises the taxonomic bias introduced by targeting specific hypervariable regions [23][24][25] .To overcome the limitations of short-read sequencing, innovative methods have been applied to reconstruct fulllength, or near full-length, bacterial 16S rRNA progenitor molecules, involving full-length amplicon shearing and dual-adapter amplicon ligation 26 , termini tagging 27 , or random intramolecular identifier insertion 28 .Although these approaches yield high-quality assemblies with low estimated error rates, they remain expensive and inaccessible in resource-poor settings 27,28 .
Illumina sequencing platforms have been widely adopted for short-read sequencing in microbiome research 23 .Among these platforms, the Illumina iSeq 100 benchtop instrument represents a compact and cost-effective alternative for sequencing small genomes and amplicons using a streamlined workflow requiring minimal expertise and hands-on time 29,30 .To date, only 300 cycle kits, generating maximum read lengths of 2 × 150 bp, are available for use on this instrument 29 .In targeted metagenomic sequencing applications, bacterial community characterisation on this platform has been limited to the V4 region (~ 254 bp) of the 16S rRNA gene because of the restrictive read lengths.This has also reduced the platform's ability to sequence larger regions of interest, such as the full-length 16S rRNA gene, which holds greater metataxonomic potential.
To capitalise on the platform's benefits and address the instrument's challenges with microbial community characterisation using larger genetic regions of interest, we developed a modified 150 bp paired-end short-read 16S rRNA amplicon sequencing technique and assembly pipeline tailored for the iSeq 100 platform (Fig. 1).The method involves the tagmentation and indexing of full-length 16S rRNA amplicons to ensure compatibility with the 2 × 150 bp read configuration of the instrument, and the subsequent redirection of the generated reads into a bioinformatic gene assembly workflow for reconstruction of the progenitor amplicon, following the sequencing of the prepared libraries.
Therefore, the aim of this study was to assess the feasibility of a modified 150 bp paired-end short-read 16S rRNA amplicon sequencing technique on the Illumina iSeq 100 platform and a bioinformatic gene assembly workflow using a commercial microbial community standard, and respiratory samples (bronchioalveolar lavage (BALF), endotracheal tube (ETT), and trunk wash (TW) fluid) collected from captive (zoo) and free-ranging African elephants.It was anticipated that short-read sequencing of the elephant respiratory samples would yield high-quality near full-length to full-length gene assemblies that, once taxonomically classified, would provide an initial indication of the bacterial communities present in the elephant respiratory system.These results provide an important foundation for advancing knowledge of the respiratory microbiome of African elephants and its contribution to their health.

Results
Following library preparation and sequencing of the tagmented 16S rRNA amplicon libraries on the iSeq 100 (Fig. 1a), 1.66 Gb of indexed reads were generated.A total of 68.90 ± 0.75% reads passed the sequencing chastity filter (%PF) and 88.58% had a Q-score of ≥ 30.Across the samples, the number of raw reads ranged from 157,010 to 350,656 reads, with an mean count of 228,770.83reads (SD = 52,632.71).Reads were re-assembled as per the workflow detailed in Fig. 1b.The assembled sequences ranged between 500 and 1696 bp (mean = 868.90;SD = 314.16)and comprised 72,274 to 164,302 sequences, respectively, (mean = 104,779.33;SD = 24,967.16).

Concordance with the theoretical composition of the microbial standard
The sequenced commercial microbial community displayed positive, but very weak (ρ = 0.117), concordance with the expected taxa and their abundances (Fig. 2).Seven of the eight expected bacterial species were detected in the sequenced mock microbial community, albeit in abundances differing from the theoretical composition (Fig. 2).Inspection of individual taxon abundances revealed the absence of Lactobacillus fermentum (reclassified as Limosilactobacillus fermentum) and lower relative abundances of Escherichia coli (4.1%), Pseudomonas aeruginosa (0.8%) and Salmonella enterica (4.0%) relative to the expected composition of the microbial community standard (10.1%,4.2% and 10.4%, respectively) (Fig. 2).In contrast, Enterococcus faecalis (29.7%) and  www.nature.com/scientificreports/Staphylococcus aureus (26.9%) were present in greater abundances in the sequenced mock microbial community relative to the theoretical composition (9.9% and 15.5%, respectively) (Fig. 2).

BALF samples and free-ranging elephant samples display greater bacterial diversity
A total of 12,660 amplicon sequence variants (ASVs) were identified in the dataset.The average number of ASVs across the dataset was 527.50 (SD = 338.81),with 125 and 1454 ASVs representing the minimum and maximum number of ASVs present in individual samples, respectively (Fig. 3).Rarefaction curves for the majority (22/24) of the samples reached a clear plateau, indicating that diversity was, for most samples, sufficiently captured (Fig. 3).The BALF samples had the highest average ASV abundance relative to ETT and TW samples (mean BALF = 628.1).Similarly, the median Simpson index estimates revealed lower diversity in ETT (0.020) and TW (0.020) samples compared to the BALF (0.016) samples.This was also reflected in the Simpson effective number, where the median number of species represented was notably higher in the BALF samples (64) compared to the ETT (51) and TW ( 51) samples (Table 1).
Free-ranging elephant respiratory samples had higher median Shannon index (4.68)and Shannon effective number estimates (108) compared to samples collected from captive zoo elephants (Shannon index = 4.28, Shannon effective number = 72) (Table 1).The median Simpson index estimates revealed greater diversity in samples obtained from free-ranging elephants (0.02) compared to captive zoo elephant (0.03) samples.Similar results were reflected in the median Simpson effective number estimates (Table 1).

Beta diversity
Beta diversity analyses revealed a trend towards significance for differences in diversity between the BALF, ETT and TW samples (p = 0.085, PERMANOVA) (Fig. 4a).Subsequent pairwise analyses indicated that significant differences exist between ETT and TW samples (p = 0.029, Mann-Whitney Test) before Benjamini-Hochberg correction (p adj = 0.087), while no significant differences between the other sample type pairs were identified (Supplementary Fig. S1).No noticeable or statistically significant patterns of dispersion or clustering were identified between captive and free-ranging elephants (p = 0.442, PERMANOVA) (Fig. 4b).

Elephant respiratory samples composed of four main bacterial phyla
The proportions of bacterial taxa varied across the samples.However, of the 15 bacterial phyla detected, only four (Proteobacteria, Actinobacteriota, Firmicutes, and Bacteroidota) displayed relative abundances exceeding 0.1%, collectively representing 98.5% of the total relative abundance across the samples (Table 2, Supplementary Material).and S3, Supplementary Material).The relative abundances of the top 25 bacterial genera grouped by sample type and ordered by Bray-Curtis distance showed Rothia, Enhydrobacter, Moraxella, and Chryseobacterium genera in five of the eight TW samples (Supplementary Fig. S3).Pseudomonas formed a significant proportion of five of the ten BALF samples collected from free-ranging elephants (Supplementary Fig. S3).
Visual inspection of the relative abundance of individual samples revealed a compositional pattern of Rothia amarae, Enhydrobacter aerosaccus, Moraxella osloensis, and Dermabacter hominis species similar to that of the one identified in the five TW samples at genus level (Fig. 5).

Discussion
In this study, we adapted an existing hypervariable sequencing technique to explore the feasibility of sequencing the full-length 16S rRNA gene on the Illumina iSeq 100.A Nextflow workflow was developed to streamline gene assembly and enable comprehensive characterisation of the elephant respiratory microbiome.
The application of the sequencing technique and assembly workflow to the mock microbial community and elephant respiratory samples resulted in the retrieval of high quality near full-length to full-length reads.In line with Illumina iSeq 100 i1 v2 reagent kit performance measures 29 , 88.58% of the base calls had a Q-score ≥ 30.Reads were expected to be approximately 1465 bp in length 24 ; however, the assembled reads ranged between 500 to 1696 bp and averaged 869 bp in length.This highlighted that our sequencing and gene assembly workflow was unable to consistently assemble the reads into the full-length sequences, particularly when considering studies that were able to generate 1430 bp 16S rRNA gene sequences of elephant gut bacteria using a Nanopore sequencer 31 , or those performing similar gene reconstruction techniques with alternative assemblers 32 .For the assembly step in our workflow, we implemented the MATAM software 33 .This tool has a documented tendency to overpredict and generate a number of short, fragmented scaffolds compared to other assemblers, which may serve to explain a portion of the variability in read lengths 34 .Short-read assembly algorithms also often struggle to differentiate between homologous sequences derived from conserved regions of the 16S rRNA gene, contributing to the generation of shorter assemblies 34 .The use of databases containing partial sequences of the target gene may also contribute to the generation of near full-length 16S rRNA amplicon assemblies.It is worth noting that the SILVA reference database primarily consists of high-quality near-full-length sequences 35 , as opposed to complete 16S rRNA gene sequences, which may account for some of the discrepancy in read lengths.However, despite this variation and reduced mean read length, assemblies achieved the expected concordance with the theoretical composition of the microbial community standard.
Although the association between the composition of the theoretical and sequenced mock communities was considered very weak, the discrepancies contributing to this result are likely to have arisen due to technical factors unrelated to the agreement in bacterial species identity, as seven of the eight expected species were present.Deviations in abundance from the theoretical composition may have occurred as a result of the DNA extraction method, as is likely the case with the absent gram-positive Limosilactobacillus fermentum species, or the PCR primer choice, which may contribute to a biased representation of various bacterial clades 32,36 .Interestingly, the abundances of Staphylococcus aureus, Salmonella enterica, Listeria monocytogenes, and Bacillus subtilis in the sequenced mock sample were over-represented in similar proportions to recent work contrasting the theoretical composition of the ZymoBIOMICS microbial standard with near-entire rrn 37 .This highlights the potential variability in the commercial product relative to stipulated proportions 37 , in addition to variation introduced through laboratory and bioinformatic processing.However, relative abundance is compositional in nature, meaning that the similarity of our sequenced microbial standard and the abundances of the aforementioned taxa in the nanopore-sequenced mock communities of Petrone et al. (2023), may have been coincidental and was likely caused by the over-or under-representation of the other bacterial species 37 .Taxonomic classification with the SILVA 138.1 reference database did not appear to misidentify any of the bacterial taxa present in the microbial standard, improving confidence in the species-level assignments.
The alpha diversity indices revealed differences in intra-sample diversity between the sample type and captivity status groups.BALF samples exhibited higher intra-sample diversity compared to the ETT and TW samples.Free-ranging elephant samples had alpha diversity estimates that suggested greater bacterial diversity exists in the respiratory communities of elephants in their natural environment in contrast to those in captivity.Similar reductions in bacterial diversity have been previously identified in the gut microbiome of managed (zoo) African elephants compared to their free-ranging counterparts, further highlighting the impact of the environment on microbial diversity within these animals 38 .Multidimensional Scaling (MDS) analysis of the respiratory bacterial profiles of the elephants revealed a statistical trend (p = 0.085, PERMANOVA) toward sample type divergence, based on the deviation of inter-sample diversity of the TW and ETT samples (p = 0.029, p adj = 0.087).Given that ETT samples were collected by rinsing the endotracheal tube with sterile saline following the collection of BALF from the lower respiratory tract of the elephant, a high degree of similarity in the bacterial communities of these sample types is expected.However, the segregation of the ETT from TW samples, although not statistically significant, suggests the occurrence of distinct bacterial signatures characteristic of the two respiratory tract compartments i.e., the lower respiratory tract and the trunk.www.nature.com/scientificreports/Through the application of the short-read 16S rRNA amplicon sequencing technique and gene assembly workflow to opportunistically collected samples, novel insights into the composition of the respiratory microbiota of captive (zoo) and free-ranging elephants could be described.The respiratory samples were dominated by Proteobacteria, Actinobacteriota, Firmicutes, and Bacteroidota phyla.Whilst the abundance of these phyla differs across animal species, these findings are in agreement with a recent meta-analysis of 16S rRNA gene datasets characterising the respiratory microbiota of six different domestic animals 21 .The presence of similar bacterial phyla across mammals may arise from common respiratory patterns, which could be attributed to the relatively uniform anatomy and physiology of the respiratory tracts among different mammalian species 39 .Results from the meta-analysis serve to further validate our genus-level findings, through their identification of similar common bacterial genera, particularly Streptococcus, Pseudomonas, Moraxella, and Acinetobacter, amongst others, in respiratory profiles of a wide range of animals 21 .
Interesting observations emerged from the visual assessment of prominent bacterial genera and species within the samples.Rothia, Enhydrobacter, and Moraxella species exhibited concurrent presence in five of the eight TW samples, suggesting species-specific involvement of Rothia amarae, Enhydrobacter aerosaccus, Moraxella osloensis, and Dermabacter hominis in this sample type.Rothia amarae species have been frequently isolated from environmental sources 40 .Furthermore, animal-associated Rothia species are prevalent members of the upper respiratory tract with documented abilities of exopolysaccharide synthesis for biofilm formation, complex carbohydrate metabolism, and antimicrobial compound production 40 .Enhydrobacter aerosaccus was first characterised following isolation from eutrophic lake water, but has been more recently implicated in various niches, including host-associated microbiomes (i.e.skin and blood) 41 .Moraxella species, including Moraxella osloensis, are recognised as common bacterial inhabitants of mucosal surfaces and have been isolated from the respiratory tracts of a range of mammalian hosts 42 and from multiple environmental sources 43 .Thus, the predominance of Rothia amarae, Enhydrobacter aerosaccus and Moraxella osloensis in the elephants' trunks is unsurprising, as the trunk represents a unique interface between the animal's upper respiratory tract and its environment, given its function in breathing, feeding, olfactory perception and tactile exploration 44 .
Pseudomonas was notably present and largely dominant in seven of the ten BALF samples collected from free-ranging elephants and may, therefore, represent a bacterial genus characteristic of this sample type.Of the members of the genus dominating the BALF profiles, Pseudomonas kribbensis and Pseudomonas koreensis have been previously isolated from garden 45 and agricultural soils 46 and Pseudomonas aeruginosa has been frequently implicated in respiratory disease 46 .The identification of these Pseudomonas species in elephants is not unexpected, given their direct and regular exposure to the ground, soil and dust.Furthermore, the lower respiratory tracts of elephants are not sterile as organic matter is often found deep in the airways during endoscopies (M.A. Miller, personal communication, 12 November 2023).Therefore, this finding highlights the possible novel contributions of less common environmental Pseudomonas species, potentially introduced through their interaction with the natural environment, in the lung microenvironment of elephants.On the other hand, the predominance of Pseudomonas aeruginosa in some of the profiles may suggest respiratory infection or compromised lung function in some of these animals, considering the bacterium's implication in respiratory disease.Pseudomonas aeruginosa is an opportunistic pathogen in both humans and animals and is often involved in persistent biofilm infections 47 .Studies have demonstrated that Pseudomonas aeruginosa induces a robust inflammatory response which, in the context of trained immunity, may play a beneficial role in modulating innate and adaptive immunity, enhancing the host's response towards other microorganisms, including mycobacteria.
Several limitations exist within the current study.In addition to the aforementioned biases introduced by PCR primers and the reference databases used for taxonomic classification, bacterial community structure represented using targeted metagenomic sequencing methods are subject to further inaccuracies.However, this issue is not unique to the current study, as it affects most microbiome studies that target the 16S rRNA gene.Other common limitations include confounding factors that influence the presence and abundance of microbiota.For instance, in the current study, variables including, but not limited to, the sampling season, age, sex, and infection status of the elephants may have influenced the respiratory microbial composition.However, sample collection could not be planned, and was instead performed opportunistically when elephants were immobilised for routine management procedures.Moreover, whilst veterinarian-assisted TW samples could be collected from free-ranging elephants during immobilisation, captive elephants were not trained to provide TW samples and thus, were not collected and included in this study.This has led to limited sample size and uneven sample type distribution, which has hindered the ability to conduct statistically sound analyses.Sample type and captivity status also present as confounding variables of each other which may limit the ability to make any definitive conclusions.Another limitation pertains to the transportation and storage of the samples at − 20 °C.The gold standard is flash-freezing or storage at -80 °C48 ; however, this presents a challenge when collecting samples in remote areas, given the limited access to the required equipment.All samples were, therefore, consistently stored in the available − 20 °C equipment and facilities.While we acknowledge the limitations hereof, we anticipate that these storage conditions have not significantly biased bacterial abundances, considering the results of a recent study that showed that unpreserved saliva samples stored at − 15 °C maintained their microbial composition and integrity 49 .
Despite these challenges, this exploratory study has provided the first glimpse into the respiratory bacterial communities harboured within these animals at species level using a short-read full-length 16S rRNA amplicon sequencing technique and assembly workflow.Future studies should consider refining the sequencing technique and microbiome-related analysis workflow.Selection of better suited primer sets, use of mock communities with representation of bacterial clades characteristic of the microbial niche of interest 48 , curated reference databases for taxonomic classification, and well-maintained 16S rRNA amplicon assembly programmes may contribute to improving species-level taxonomic assignments and characterisation of bacterial communities.In the context of the elephant respiratory microbiota, a larger sample size may allow for the identification of novel microbial

Conclusion
The results presented in the current study demonstrate the viability of the adapted sequencing technique for high-quality partial to near full-length short-read 16S rRNA gene sequencing on the Illumina iSeq 100 platform.Application of the technique to mock microbial community standards and African elephant respiratory samples enabled accurate taxonomic classification of bacteria at genus and species level.It is essential to exercise caution when interpreting these findings, given the limited sample size and the multitude of potentially influential external factors.Further investigation into the respiratory microbiota of free-ranging and captive elephants is warranted to validate the tentative visually-identified bacterial patterns.

Specimen collection
A total of 17 African elephants were sampled, four of which were captive zoo elephants and 13 of which were freeranging individuals (Supplementary Fig. S5).From these, 12 BALF samples, three ETT samples and eight TW samples were opportunistically collected from immobilized elephants during routine management procedures using previously reported methods [50][51][52] .The samples described were not primarily collected for microbiome analysis, but rather acquired as part of an approved project to evaluate respiratory samples using mycobacterial culture.Approximately 150 mL of each respiratory sample was collected in a 500 mL sterile suction vacuum container., alongside sample-specific metadata (Fig. 1b).Briefly, following input into the 16S-amplicon-seq workflow, raw fastq files were directed to FastQC 58 (version 0.11.9) to generate summary reports detailing basic read and sequence-related statistics (i.e., read count, sequence length, percentage GC content, quality scores and adapter content) for each sample.Reads were trimmed using Trim Galore 59 (version 0.6.5)with the following parameters: -q 30, -length 50, -clip_R1 and -clip_R2 flags, to retain bases with quality scores ≥ 30 and reads with a minimum length of 50 bases, and ensure 5′ region of the sequences were removed to account for reduced read quality.Forward and reverse reads were concatenated to ensure compliance with the MATAM 33 (version 1.6.0)gene assembly algorithm's input requirements.The MATAM assembly module was executed using default parameters.Sequence abundance estimates, required for ASV table generation, were extracted from MATAM's output.Taxonomic classification was performed on the resultant assembly files using the SILVA 138.1 prokaryotic SSU taxonomic training dataset formatted for DADA2 (silva_nr99_v138.1_wSpecies_train_set.fa.gz) 35 by implementing the RDP Naïve Bayesian Classifier algorithm 60 .Taxonomy tables and abundance files were integrated into sample-specific amplicon sequence variant (ASV) tables.Lastly, each newly generated ASV and taxonomy table was merged to create a single phyloseq-compatible ASV and taxonomy comma separated value (csv) file for the processed samples.
The phyloseq 56 tax_glom function is unable to agglomerate taxa with identical names at lower taxonomic levels with differing taxonomy at higher ranks, which impacts its ability to collapse bacterial taxa to species-level rank.To account for this, taxonomy tables were redirected into an automated taxonomy renaming tool (https:// github.com/ laure ncmar tin/ taxa-table-tool) that denominates the input to ensure unique species labels for all genera that share the same species names 55 .A denominated taxonomy table was then re-imported into R (version 4.3.1)for diversity analyses using scripts contained in the Rhea 61 pipeline and regeneration of an updated phyloseq object for subsequent agglomeration of the taxa at genus and species level and relative abundance visualisation.

Mock microbial community comparison
The comparison of the mock community profile to the theoretical composition of the ZymoBIOMICS™ Microbial Community Standard was performed using the checkZymoBiomics function of the chkmocks package 36 in R. The checkZymoBiomics function uses FASTA files containing full-length 16S rRNA gene sequences of the expected microbes in the commercial standard for the taxonomic classification of the ASVs present in the mock community of interest.It then assesses the degree of similarity between the relative taxon abundances of the theoretical composition of the standard reported by the supplier, derived using the formula: (total genomic DNA (g) × unit conversion constant (bp/g)/genome size (bp)) × 16S/18S copy number per genome), against the composition of mock community with a Spearman's rank correlation test 36 .The sequenced mock and theoretical composition were visualised as a stacked bar plot using the plotZymoDefault function of the same package.

Diversity analyses
Bacterial composition and diversity analyses at phylum, genus, and species level were conducted in R. The ASV table was normalised using Total Sum Scaling (TSS) procedures, as scripted for in the Rhea pipeline, whereby each sample's counts were divided by their sample size and subsequently scaled up by the smallest sample's size across the dataset 61 .This scaling method was implemented to minimise data loss or the introduction of variance to the data, which commonly occurs when rarefying (randomly sub-sampling reads to a specific sequencing depth) 61 .Rarefaction curves were generated, using the same script, to assess whether the sequencing depth was adequate to capture the diversity within the samples.An abundance filtering cut-off of 0.10% was applied to the dataset to prevent downstream analysis of spurious reads 61 .Alpha and beta diversity analyses were performed with scripts supplied in the Rhea pipeline 61 .Shannon, effective Shannon, Simpson and effective Simpson alpha diversity indices were used to assess intra-sample variation and Multidimensional (MDS) plots of generalized UniFrac distances were used to visualise the distribution of the samples with reference to their phylogenetic relatedness to evaluate beta diversity.Assessment of differences in beta diversity centroids and dispersion among the studied groups was performed using a Permutational Multivariate Analysis of Variance (PERMANOVA) by employing the adonis2 function nested within the vegan 62 package.Significance was defined as p < 0.05.

Taxon abundances
Following generation of a phyloseq object from the normalised ASV table, the denominated taxa table and metadata file, normalised ASV counts were transformed into proportions and visualisation of the top 25 abundant bacterial genera and species across samples using facet-wrapped stacked bar charts was performed.The proportional average abundances and standard deviations of the top phyla, genera, and species across groups were obtained using the get_abundances function from the microbiomeutilities 63

Figure 1 .
Figure 1.Overview of the library preparation, sequencing, and 16S rRNA amplicon re-assembly.Samples are subject to (a) DNA extraction, library preparation, and 150 bp paired-end short-read sequencing on the Illumina iSeq 100, before (b) amplicon reconstruction using the 16S-amplicon-seq workflow and microbiome diversity and composition analysis.

Figure 2 .
Figure 2. Representation of the concordance between the theoretical composition of the commercial microbial standard and microbial standard sequenced and assembled using the adapted 150 bp paired-end full-length 16S rRNA gene sequencing technique and 16S-amplicon-seq workflow.Horizontal stacked bar charts depicting the composition of sequenced microbial standard and theoretical composition of ZymoBIOMICS™ Microbial Community Standard.A very weak concordance (ρ = 0.117) between the commercial standard and the sequenced mock community was found.

Figure 3 .
Figure 3. Rarefaction curves of elephant respiratory samples delineated according to sample type and captivity status.The horizontal axis shows the total number of assembled reads.The vertical axis depicts the number of ASVs observed.The majority of the samples (22/24) achieved a plateau, indicating that diversity was sufficiently captured.BALF bronchoalveolar lavage fluid, ETT endotracheal tube wash, TW trunk wash.

Figure 4 .
Figure 4. Multidimensional scaling (MDS) plot of generalized Unifrac dissimilarities derived from the bacterial communities of BALF, ETT, and TW respiratory samples of captive (zoo) and free-ranging elephants.Plots are segregated by (a) sample type and (b) captivity status.Intersample variation between sample type groups trended toward significance (p = 0.085, PERMANOVA).Pairwise comparison revealed near-significant differences in the beta diversity of ETT and TW (p = 0.029, p adj = 0.087).There were no significant differences between free-ranging (National Park) and captive (Zoo) elephants (p = 0.442, PERMANOVA).BALF bronchoalveolar lavage fluid, ETT endotracheal tube wash, TW trunk wash, NP National Park.

Figure 5 .
Figure 5. Top 25 most abundant bacterial species present in BALF, ETT, and TW samples collected from captive (zoo) and free-ranging elephants.Relative abundances of these species are displayed with stacked bar charts stratified by sample type.Samples are ordered according to Bray-Curtis dissimilarity.BALF bronchoalveolar lavage fluid, ETT endotracheal tube wash, TW trunk wash.

Table 1 .
Shannon , Shannon effective number, Simpson and Simpson effective number alpha diversity estimates for sample types and captivity status groups.BALF bronchoalveolar lavage fluid, ETT endotracheal tube wash, TW trunk wash, IQR inter-quartile range.

Table 2 .
Relative abundance (%) of prominent phyla in the bacterial profiles of the respiratory samples stratified by sample type and captivity status.BALF bronchoalveolar lavage fluid, ETT endotracheal tube wash, TW trunk wash.